Analyzing In-Memory Hash Joins: Granularity Matters
نویسندگان
چکیده
Predicting the performance of join algorithms on modern hardware is challenging. In this work, we focus on mainmemory no-partitioning and partitioning hash join algorithms executing on multi-core platforms. We discuss the main parameters impacting performance, and present an effective performance model. This model can be used to select the most appropriate algorithm for different input data-sets for current and future hardware configurations. We find that for modern systems an optimized no-partition hash join often outperforms an optimized radix partitioning hash join.
منابع مشابه
Cache-Oblivious Hash Joins
Partitioning has been used to improve the performance of the hash join in the main memory; however, cache-conscious partitioning requires the knowledge about the cache parameters, such as the capacity and unit size, of a chosen level of the CPU caches, e.g., the L2 cache. Obtaining this knowledge and subsequently tuning the algorithm may be inconvenient, and sometimes infeasible, for complex sy...
متن کاملMemory-Efficient Hash Joins
We present new hash tables for joins, and a hash join based on them, that consumes far less memory and is usually faster than recently published in-memory joins. Our hash join is not restricted to outer tables that fit wholly in memory. Key to this hash join is a new concise hash table (CHT), a linear probing hash table that has 100% fill factor, and uses a sparse bitmap with embedded populatio...
متن کاملAn Adaptive Hash Join Algorithm for Multiuser Environments
As main memory becomes a cheaper resource, hash joins are an alternative to the traditional methods of performing equi-joins: nested loop and merge joins. This paper introduces a modified, adaptive hash join method that is designed to work with dynamic changes in the amount of available memory. The general idea of the algorithm is to regulate resource usage of a hash join in a way that allows i...
متن کاملMemory-Contention Responsive Hash Joins
In order to maximize system performance in environments with fluctuating memory contention, memory-intensive algorithms such as hash join must gracefully adapt to variations in available memory. Mixed workloads, creating fluctuations of erratic frequency and magnitude, make responsiveness to memory contention particularly important. Previous studies on adaptable hash joins have focused on lower...
متن کاملOn a Three-Way Hash Join Algorithm
We develop hash-based algorithms for computing a three-way join. The method involves hashing all three relations into buckets, and then joining buckets in main memory, three buckets at a time. Comparing to two-cascaded hash joins, the algorithms avoid materializing an intermediate result. We present a cost model for this approach, from which we identify the range of parameters for queries that ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2017